Fake videos represent an important misinformation threat. While existing forensic networks have demonstrated strong performance on image forgeries, recent results reported on the Adobe VideoSham dataset show that these networks fail to identify fake content in videos. In this paper, we propose a new network that is able to detect and localize a wide variety of video forgeries and manipulations. To overcome challenges that existing networks face when analyzing videos, our network utilizes forensic embeddings to capture traces left by manipulation, context embeddings to exploit forensic traces' conditional dependencies upon local scene content, and spatial attention provided by a deep, transformer-based attention mechanism. We create several new video forgery datasets and use these, along with publicly available data, to experimentally evaluate our network's performance. The results show that our proposed network is able to identify a diverse set of video forgeries, including those not encountered during training. Furthermore, our results reinforce recent findings that image forensic networks largely fail to identify fake content in videos.
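The fusion of forensic and context embeddings under spatial attention can be pictured with a minimal numpy sketch. Everything below is an illustrative assumption, not the paper's architecture: the identity query/key/value projections, the per-patch embedding shapes, and the single attention pass stand in for a deep, transformer-based mechanism with learned weights.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_embeddings(forensic, context):
    """Concatenate per-patch forensic and context embeddings, then apply one
    self-attention pass so each patch can weigh forensic traces by their
    local scene content. Shapes: (num_patches, dim) each."""
    tokens = np.concatenate([forensic, context], axis=-1)  # (P, 2*dim)
    d = tokens.shape[-1]
    # Identity projections keep the sketch dependency-free; a real network
    # would use learned query/key/value weight matrices here.
    scores = tokens @ tokens.T / np.sqrt(d)   # (P, P) patch-to-patch affinities
    attn = softmax(scores, axis=-1)           # each row sums to 1
    return attn @ tokens                      # attended patch features

rng = np.random.default_rng(0)
out = fuse_embeddings(rng.normal(size=(6, 8)), rng.normal(size=(6, 8)))
print(out.shape)  # (6, 16)
```

The attended features would then feed a localization head that marks manipulated regions.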
Visually realistic GAN-generated images have recently emerged as an important misinformation threat. Research has shown that these synthetic images contain forensic traces that forensic detectors can readily identify. Unfortunately, these detectors are built upon neural networks, which are vulnerable to recently developed adversarial attacks. In this paper, we propose a new anti-forensic attack capable of fooling GAN-generated image detectors. Our attack uses an adversarially trained generator to synthesize traces that these detectors associate with real images. Furthermore, we propose a technique to train our attack so that it achieves transferability, i.e., it can fool unknown CNNs that it was not explicitly trained against. We evaluate our attack through an extensive series of experiments, where we show that it can fool eight state-of-the-art detection CNNs using synthetic images created with seven different GANs, and outperforms other alternative attacks.
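The core anti-forensic idea, perturbing a synthetic image until a detector scores it as real, can be illustrated against a stand-in detector. The linear logistic detector, step size, and stopping rule below are simplifying assumptions; the actual attack described here trains a generator network against CNN detectors.

```python
import numpy as np

def attack_image(image, detector_w, detector_b, steps=200, lr=0.1):
    """Gradient-based sketch of an anti-forensic attack: perturb a synthetic
    image (flattened to a vector) until a linear stand-in detector scores it
    as 'real' (score < 0). A real attack would instead train a generator to
    synthesize the traces the detector associates with real images."""
    x = image.copy()
    for _ in range(steps):
        score = detector_w @ x + detector_b   # > 0 means 'detected as fake'
        if score < 0:
            break
        x -= lr * detector_w                  # step against the detector gradient
    return x

rng = np.random.default_rng(1)
w = rng.normal(size=16); w /= np.linalg.norm(w)   # hypothetical detector weights
fake = rng.normal(size=16) + 2.0 * w              # image carrying 'fake' traces
adv = attack_image(fake, w, 0.0)
print(w @ adv < 0)  # True: the detector now scores the image as 'real'
```

Transferability, fooling detectors the attack never saw, is what the adversarially trained generator adds over this single-detector sketch.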
There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to read out information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .
Supervised machine learning-based medical image computing applications necessitate expert label curation, while unlabelled image data might be relatively abundant. Active learning methods aim to prioritise a subset of available image data for expert annotation, for label-efficient model training. We develop a controller neural network that measures the priority of images in a sequence of batches, as in batch-mode active learning, for multi-class segmentation tasks. The controller is optimised by rewarding positive task-specific performance gain, within a Markov decision process (MDP) environment that also optimises the task predictor. In this work, the task predictor is a segmentation network. A meta-reinforcement learning algorithm is proposed with multiple MDPs, such that the pre-trained controller can be adapted to a new MDP that contains data from different institutes and/or requires segmentation of different organs or structures within the abdomen. We present experimental results using multiple CT datasets from more than one thousand patients, with segmentation tasks of nine different abdominal organs, to demonstrate the efficacy of the learnt prioritisation controller function and its cross-institute and cross-organ adaptability. We show that the proposed adaptable prioritisation metric yields converging segmentation accuracy for the novel class of kidney, unseen in training, using between approximately 40\% and 60\% of the labels otherwise required with other heuristic or random prioritisation metrics. For clinical datasets of limited size, the proposed adaptable prioritisation offers a performance improvement of 22.6\% and 10.2\% in Dice score, for tasks of kidney and liver vessel segmentation, respectively, compared to random prioritisation and alternative active sampling strategies.
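A bandit-style sketch of the prioritisation loop, under loud assumptions: the controller here is a plain linear scorer, the reward is a single scalar performance gain, and `prioritise`/`reinforce_update` are hypothetical names. The paper's controller is a neural network optimised jointly with a segmentation task predictor inside an MDP, and is then meta-adapted across MDPs.

```python
import numpy as np

def prioritise(features, controller_w, batch_size):
    """Score each unlabelled image with a (stand-in linear) controller and
    return indices of the highest-priority batch (batch-mode active learning)."""
    scores = features @ controller_w
    return np.argsort(scores)[::-1][:batch_size]

def reinforce_update(controller_w, features, chosen, reward, lr=0.01):
    """One REINFORCE-style update: push the controller towards (reward > 0)
    or away from (reward < 0) the features of the images it just selected."""
    grad = features[chosen].mean(axis=0)
    return controller_w + lr * reward * grad

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 5))        # hypothetical per-image feature vectors
w = np.zeros(5)                          # untrained controller
batch = prioritise(feats, w, batch_size=8)
# Reward would be the task predictor's gain (e.g. Dice improvement) after
# the chosen batch is annotated and used for training.
w = reinforce_update(w, feats, batch, reward=+1.0)
print(len(batch))  # 8
```

Meta-training repeats this loop over multiple MDPs (institutes/organs) so the controller initialisation adapts quickly to a new one.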
Data-driven modeling approaches such as jump tables are promising techniques to model populations of resistive random-access memory (ReRAM) or other emerging memory devices for hardware neural network simulations. As these tables rely on data interpolation, this work explores open questions about their fidelity in relation to the stochastic device behavior they model. We study how various jump table device models impact the attained network performance estimates, a concept we define as modeling bias. Two methods of jump table device modeling, binning and Optuna-optimized binning, are explored using synthetic data with known distributions for benchmarking purposes, as well as experimental data obtained from TiOx ReRAM devices. Results on a multi-layer perceptron trained on MNIST show that device models based on binning can behave unpredictably, particularly when the device dataset contains a low number of points, sometimes over-promising and sometimes under-promising target network accuracy. This paper also proposes device-level metrics that exhibit trends similar to the modeling bias metric at the network level. The proposed approach opens the possibility for future investigations into statistical device models with better performance, as well as experimentally verified modeling bias in different in-memory computing and neural network architectures.
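A minimal sketch of binning-based jump-table modeling: measured (conductance-before, conductance-after) update pairs are bucketed by starting conductance, and new states are drawn from each bucket's empirical distribution. The bin count, the synthetic Gaussian update model, and the inverse-CDF sampler with linear interpolation are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def build_jump_table(g_before, g_after, num_bins=32):
    """Bin measured update pairs into a jump table: for each
    starting-conductance bin, keep the sorted samples of the resulting
    conductance (an empirical CDF)."""
    edges = np.linspace(g_before.min(), g_before.max(), num_bins + 1)
    idx = np.clip(np.digitize(g_before, edges) - 1, 0, num_bins - 1)
    table = [np.sort(g_after[idx == b]) for b in range(num_bins)]
    return edges, table

def sample_update(g, edges, table, rng):
    """Draw a post-update conductance for state g by inverse-CDF sampling
    from its bin, linearly interpolating between stored samples."""
    b = int(np.clip(np.digitize(g, edges) - 1, 0, len(table) - 1))
    samples = table[b]
    if len(samples) == 0:                 # empty bin: fall back to no change
        return g
    u = rng.random() * (len(samples) - 1)
    lo = int(u)
    hi = min(lo + 1, len(samples) - 1)
    frac = u - lo
    return (1 - frac) * samples[lo] + frac * samples[hi]

rng = np.random.default_rng(0)
g0 = rng.uniform(0, 1, 5000)                               # synthetic device data
g1 = np.clip(g0 + rng.normal(0.05, 0.02, 5000), 0, 1)      # known update distribution
edges, table = build_jump_table(g0, g1)
print(0.4 < sample_update(0.5, edges, table, rng) < 0.7)   # True
```

Modeling bias is then the gap between network accuracy simulated with such a table and accuracy simulated with the true (here, known Gaussian) update distribution.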
Three-dimensional (3D) freehand ultrasound (US) reconstruction without a tracker can be advantageous over its two-dimensional or tracked counterparts in many clinical applications. In this paper, we propose to estimate the 3D spatial transformation between US frames from both past and future 2D images, using feed-forward and recurrent neural networks (RNNs). With the temporally available frames, a further multi-task learning algorithm is proposed to utilise a large number of auxiliary transformation-predicting tasks between them. Using more than 40,000 US frames acquired from 228 scans on 38 forearms of 19 volunteers, the hold-out test performance is quantified by frame prediction accuracy, volume reconstruction overlap, accumulated tracking error and final drift, based on ground truth from an optical tracker. The results show the importance of modelling the temporally and spatially correlated input frames as well as output transformations, with further improvement owing to additional past and/or future frames. The best-performing model was associated with predicting transformation between moderately spaced frames, with an interval of less than ten frames at 20 frames per second (fps). Little benefit was observed by adding frames more than one second away from the predicted transformation, with or without LSTM-based RNNs. Interestingly, with the proposed approach, an explicit within-sequence loss that encourages consistency in composing transformations or minimises accumulated error may no longer be required. The implementation code and volunteer data will be made publicly available, ensuring reproducibility and enabling further research.
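The accumulated tracking error and final drift metrics follow directly from composing the predicted frame-to-frame transforms. A small sketch, assuming 4x4 homogeneous transforms and a pure-translation scan path chosen only for readability:

```python
import numpy as np

def compose(transforms):
    """Chain 4x4 frame-to-frame transforms to obtain each frame's pose in the
    coordinates of frame 0, as done when accumulating freehand-US predictions."""
    poses = [np.eye(4)]
    for t in transforms:
        poses.append(poses[-1] @ t)
    return poses

def final_drift(pred_transforms, true_transforms):
    """Translation error between the predicted and ground-truth pose of the
    last frame: the 'final drift' metric."""
    p = compose(pred_transforms)[-1]
    g = compose(true_transforms)[-1]
    return float(np.linalg.norm(p[:3, 3] - g[:3, 3]))

def step(dx):
    """Helper: a pure translation of dx mm along x."""
    t = np.eye(4); t[0, 3] = dx; return t

true = [step(1.0)] * 10    # probe moves 1 mm between consecutive frames
pred = [step(1.05)] * 10   # a 0.05 mm per-frame error accumulates over the scan
print(round(final_drift(pred, true), 2))  # 0.5
```

This accumulation of small per-frame errors is why inter-frame intervals and within-sequence consistency matter for trackerless reconstruction.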
The oblique random survival forest (RSF) is an ensemble supervised learning method for right-censored outcomes. Trees in an oblique RSF are grown using linear combinations of predictor variables to create branches, whereas a standard RSF uses a single predictor. Oblique RSF ensembles often have higher prediction accuracy than standard RSF ensembles. However, assessing all possible linear combinations of predictors induces significant computational overhead, limiting application to large-scale datasets. In addition, few methods have been developed for interpreting oblique RSF ensembles, and they remain harder to interpret than their axis-based counterparts. We introduce a method to increase the computational efficiency of oblique RSFs, together with a method for estimating the importance of individual predictor variables in an oblique RSF. Our strategy for reducing computational overhead makes use of Newton-Raphson scoring, a classical optimization technique that we apply to the Cox partial likelihood function within each non-leaf node of the decision trees. We estimate the importance of an individual predictor for the oblique RSF by negating each coefficient used for that predictor in the linear combinations, and then computing the resulting reduction in accuracy. In general benchmarking experiments, we find that our implementation of the oblique RSF is roughly 450 times faster, with a better Brier score, than existing software for oblique RSFs. In simulation studies, we find that 'negation importance' discriminates between relevant and irrelevant predictors more reliably than permutation importance, Shapley additive explanations, and a previously introduced ANOVA-based technique for measuring variable importance in oblique RSFs. The methods introduced in the current study are available in the aorsf R package.
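Negation importance itself is easy to state in code: flip the sign of one predictor's coefficient and measure the drop in accuracy. The sketch below simplifies heavily, using a single linear combination and plain classification accuracy instead of per-node Cox models and survival metrics, so treat it as an illustration of the idea rather than the aorsf implementation.

```python
import numpy as np

def negation_importance(X, y, coefs, accuracy_fn):
    """For each predictor, negate its coefficient in the (stand-in, single)
    linear combination and record the drop in accuracy. In an oblique RSF
    this is done for the coefficients across all non-leaf nodes."""
    base = accuracy_fn(X @ coefs, y)
    importances = np.empty(len(coefs))
    for j in range(len(coefs)):
        flipped = coefs.copy()
        flipped[j] = -flipped[j]
        importances[j] = base - accuracy_fn(X @ flipped, y)
    return importances

def acc(score, y):
    """Simple sign-based classification accuracy (stand-in for a survival metric)."""
    return float(np.mean((score > 0) == y))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X[:, 0] > 0                       # only predictor 0 is actually relevant
coefs = np.array([2.0, 0.1, 0.1])     # hypothetical fitted coefficients
imp = negation_importance(X, y, coefs, acc)
print(imp.argmax())  # 0: the relevant predictor gets the largest importance
```

Negating a coefficient reverses that predictor's contribution entirely, which is why it separates relevant from irrelevant predictors more sharply than merely permuting values.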
In this work, we consider the task of pairwise cross-modality image registration, which may benefit from exploiting additional images that are available only at training time and come from a modality different from those being registered. As an example, we focus on aligning intra-subject multiparametric magnetic resonance (mpMR) images, between T2-weighted (T2w) scans and diffusion-weighted scans with high b-values (DWI$_{high-b}$). For the application of localising tumours in mpMR images, diffusion scans with zero b-value (DWI$_{b=0}$) are considered easier to register to T2w, due to the availability of corresponding features. We propose a learning-from-privileged-modality algorithm, using the training-only imaging modality DWI$_{b=0}$, to support this challenging multi-modality registration problem. We present experimental results based on 369 sets of 3D multiparametric MRI images from 356 prostate cancer patients, with substantially improved alignment compared with the 7.96 mm misalignment before registration. The results also show that the proposed learning-based registration network achieves efficient registration with comparable or better accuracy than classical iterative algorithms and other tested learning-based methods, with or without the additional modality. These compared algorithms also failed to produce any markedly improved alignment between DWI$_{high-b}$ and T2w in this challenging application.
Clinically significant prostate cancer has a better chance of being sampled if lesions deemed suspicious on pre-operative magnetic resonance (MR) images are used as targets during ultrasound-guided biopsy procedures. However, the diagnostic accuracy of the biopsy procedure is limited by the operator-dependent skill and experience in sampling the targets, a sequential decision-making process that involves navigating an ultrasound probe and placing a series of sampling needles for potentially multiple targets. This work aims to learn a reinforcement learning (RL) policy that optimises the actions of continuous positioning of 2D ultrasound views and biopsy needles with respect to a guiding template, such that the MR targets can be sampled efficiently and sufficiently. We first formulate the task as a Markov decision process (MDP) and construct an environment that allows the targeting actions to be performed virtually for individual patients, based on their anatomy and lesions derived from MR images. A patient-specific policy can thus be optimised, before each biopsy procedure, by rewarding positive sampling in the MDP environment. Experimental results from fifty-four prostate cancer patients show that the proposed RL-learned policies obtained a mean hit rate of 93% and an average cancer core length of 11 mm, comparing favourably with two human-designed alternative baseline strategies, without hand-engineered rewards that directly maximise these clinically relevant metrics. Perhaps more interestingly, the RL agents were found to learn strategies adaptive to lesion size, in which the spread of needles was prioritised for smaller lesions. Such strategies have not previously been reported or commonly adopted in clinical practice, yet led to overall superior targeting performance when compared with intuitively designed strategies.
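The patient-specific MDP can be caricatured in a few lines, with a 1D template position as state and sampled core length as reward. The real environment is built from 3D patient anatomy and MR-derived lesions, so `BiopsyEnv` and its geometry are purely hypothetical.

```python
import numpy as np

class BiopsyEnv:
    """Toy stand-in for the patient-specific MDP: the state is a needle
    position along a 1D guiding template, an action repositions it, and the
    reward is the length of lesion tissue the needle would sample
    ('positive sampling')."""
    def __init__(self, lesion_centre, lesion_radius):
        self.centre, self.radius = lesion_centre, lesion_radius
        self.pos = 0.0

    def step(self, action):
        self.pos += action                                 # continuous repositioning
        overlap = max(0.0, self.radius - abs(self.pos - self.centre))
        return self.pos, overlap                           # reward: sampled core length

# Brute-force 'policy search' over single actions, in place of RL training:
best = max(np.linspace(-5, 5, 101),
           key=lambda a: BiopsyEnv(lesion_centre=3.0, lesion_radius=1.0).step(a)[1])
print(round(float(best), 6))  # 3.0: moving straight to the lesion centre wins
```

With multiple needles per lesion, maximising total reward in such an environment is what lets the agent discover size-adaptive spreading without hand-engineered rewards.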
This paper summarises and evaluates various approaches, methods, and techniques for pursuing fairness in artificial intelligence (AI) systems. It examines the merits and shortcomings of these measures and proposes practical guidelines for defining, measuring, and preventing bias in AI. In particular, it cautions against some simple yet common methods for evaluating bias in AI systems, and offers more sophisticated and effective alternatives. The paper also addresses widespread controversies and confusions in the field by providing a common language among the different stakeholders of high-impact AI systems. It describes various trade-offs involving AI fairness and provides practical recommendations for balancing them. It offers techniques for evaluating the costs and benefits of fairness targets, and defines the role of human judgement in setting these targets. The paper provides discussions and guidelines for AI practitioners, organisational leaders, and policy-makers, as well as various links to additional materials for a more technical audience. Numerous real-world examples are provided to clarify the concepts, challenges, and recommendations from a practical perspective.